Forward-Backward Convolutional LSTM for Acoustic Modeling

Authors

  • Shigeki Karita
  • Atsunori Ogawa
  • Marc Delcroix
  • Tomohiro Nakatani
Abstract

Automatic speech recognition (ASR) performance has greatly improved with the introduction of convolutional neural networks (CNNs) and long short-term memory (LSTM) networks for acoustic modeling. Recently, the convolutional LSTM (CLSTM) has been proposed to use the convolution operation directly within LSTM blocks, combining the advantages of CNN and LSTM structures in a single architecture. This paper presents the first attempt to use CLSTMs for acoustic modeling. In addition, we propose a new forward-backward architecture to exploit long-term left/right context efficiently. The proposed scheme combines forward and backward LSTMs at different time points of an utterance with the aim of modeling long-term, frame-invariant information such as speaker characteristics and channel. Furthermore, unlike conventional bidirectional LSTM (BLSTM) architectures, the proposed forward-backward architecture can be trained with truncated back-propagation through time. Therefore, we are able to train deeply stacked CLSTM acoustic models, which is practically challenging with conventional BLSTMs. Experimental results show that both CLSTM and forward-backward LSTM improve word error rates significantly compared to standard CNN and LSTM architectures.
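
To make the idea of "convolution within the LSTM blocks" concrete, the following is a minimal NumPy sketch of one time step of a convolutional LSTM cell. It assumes the commonly used formulation in which the matrix products of a standard LSTM are replaced by convolutions, here 1-D along the frequency axis and without peephole connections. The abstract does not give the paper's exact equations, so the function names (`conv1d_same`, `clstm_step`), the tensor shapes, and the gate ordering are all illustrative assumptions, and the forward-backward combination is not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv1d_same(x, w):
    """Zero-padded 'same' 1-D cross-correlation along the frequency axis.

    x: (in_channels, n_freq)
    w: (out_channels, in_channels, kernel), kernel assumed odd
    returns: (out_channels, n_freq)
    """
    out_ch, in_ch, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    n_freq = x.shape[1]
    out = np.zeros((out_ch, n_freq))
    for f in range(n_freq):
        # Contract the kernel with the local window over channels and taps.
        out[:, f] = np.tensordot(w, xp[:, f:f + k], axes=([1, 2], [0, 1]))
    return out


def clstm_step(x_t, h_prev, c_prev, w_x, w_h, b):
    """One time step of a convolutional LSTM cell (assumed formulation).

    x_t: (in_ch, n_freq); h_prev, c_prev: (hid_ch, n_freq)
    w_x: (4 * hid_ch, in_ch, k); w_h: (4 * hid_ch, hid_ch, k)
    b:   (4 * hid_ch, 1)
    """
    # All four gate pre-activations come from convolutions, not matmuls.
    gates = conv1d_same(x_t, w_x) + conv1d_same(h_prev, w_h) + b
    i, f, g, o = np.split(gates, 4, axis=0)  # assumed gate order
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t
```

Because the hidden and cell states keep the frequency axis, each gate sees only a local frequency neighborhood at every step, while the recurrence carries context across time.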

Similar resources

On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones

We design a novel deep learning framework for multi-channel speech recognition in two aspects. First, for the front-end, an iterative mask estimation (IME) approach based on deep learning is presented to improve the beamforming approach based on the conventional complex Gaussian mixture model (CGMM). Second, for the back-end, deep convolutional neural networks (DCNNs), with augmentation of both...

Attention Based CLDNNs for Short-Duration Acoustic Scene Classification

Recently, neural networks with deep architectures have been widely applied to acoustic scene classification. Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory Networks (LSTMs) have shown improvements over fully connected Deep Neural Networks (DNNs). Motivated by the fact that CNNs, LSTMs and DNNs are complementary in their modeling capability, we apply the CLDNNs (Convolutiona...

Deep Learning for Query Semantic Domains Classification

Long Short Term Memory (LSTM), a type of recurrent neural network, has been widely used for language modeling. One application is speech query domain classification, where LSTM is shown to be more effective than traditional statistical models and feedforward neural networks. Different from speech queries, text queries to search engines are usually shorter and lack correct gram...

Layerwise Interweaving Convolutional LSTM

A deep network structure is formed by interweaving LSTM layers and convolutional layers. The Layerwise Interweaving Convolutional LSTM (LIC-LSTM) enhances the feature extraction ability of the LSTM stack and is capable of versatile sequential data modeling. Its unique network structure allows it to extract higher-level features with sequential information involved. Experiment results...

Long short-term memory recurrent neural network architectures for large scale acoustic modeling

Long Short-Term Memory (LSTM) is a specific recurrent neural network (RNN) architecture that was designed to model temporal sequences and their long-range dependencies more accurately than conventional RNNs. In this paper, we explore LSTM RNN architectures for large scale acoustic modeling in speech recognition. We recently showed that LSTM RNNs are more effective than DNNs and conventional RNN...

Publication date: 2017